DukeSpace

MDC Repository

Allelic Gene Structure Variations in Anopheles gambiae Mosquitoes

Author: AM McGuire
B Modrek
B Modrek
CE Pearson
DB Malko
DM Menge
E Birney
EC Swart
EV Kriventseva
F Oduol
G Dimopoulos
Guiyun Yan
H Nagasaki
H Ranson
J Li
J Sambrook
JC Venter
JM Johnson
JMC Ribeiro
Jose M. C. Ribeiro
Juan Valcarcel
Jun Li
L Zheng
LE Maquat
M Pombi
M Wang
MI McCarthy
MJ Gorman
ML Tress
MM Riehle
MM Riehle
NN Singh
P Early
PA Estes
PA Sharp
RA Holt
SD Schlueter
SM Gomez
TD Wu
V Nembaware
V Nembaware
W Gilbert
WH Majoros
Z Wang
Z Wang
Publication venue: Public Library of Science
Publication date: 01/05/2010

Allelic gene structure variations and alternative splicing are responsible for transcript structure variations. More than 75% of human genes have structural isoforms of transcripts, but to date few studies have been conducted to verify the alternative splicing systematically. The present study used expressed sequence tags (ESTs) and EST-tagged SNP patterns to examine the transcript structure variations resulting from allelic gene structure variations in the major human malaria vector, Anopheles gambiae. About 80% of 236,004 available A. gambiae ESTs were successfully aligned to A. gambiae reference genomes. More than 2,340 transcript structure variation events were detected. Because the current A. gambiae annotation is incomplete, we re-annotated the A. gambiae genome with an A. gambiae-specific gene model so that the effect of variations on gene coding could be better evaluated. A total of 15,962 genes were predicted. Among them, 3,873 were novel genes and 12,089 were previously identified genes. The gene completion rate improved from 60% to 84%. Based on EST support, 82.5% of gene structures were predicted correctly. In light of the new annotation, we found that approximately 78% of transcript structure variations were located within the coding sequence (CDS) regions, and more than 65% of variations in the CDS regions have the same open reading frame. The association between transcript structure isoforms and SNPs indicated that more than 28% of transcript structure variation events were contributed by different gene alleles in A. gambiae. We successfully expanded the A. gambiae genome annotation. We predicted and analyzed transcript structure variations in A. gambiae and found that allelic gene structure variation plays a major role in transcript diversity in this important human malaria vector.

eScholarship - University of California

Improved annotation with de novo transcriptome assembly in four social amoeba species

Author: AJ Heidel
AM Waterhouse
B Langmead
B Ma
BJ Haas
C Burge
C Cole
C Gissi
C Schilde
C Trapnell
Christian Cole
Christina Schilde
CP Ponting
EM Quinn
F Jeanmougin
FA Simão
G Glöckner
G Rot
Geoffrey J. Barton
Gernot Glöckner
GS Slater
Hajara M. Lawal
I Letunic
JP Huelsenbeck
JW Nicol
KE Hayer
L Eichinger
LR Rabiner
M Felder
M Stanke
M Yandell
MA Hassan
MD Macmanes
MG Grabherr
MH Schulz
ML Metzker
NJ Schurch
Pauline Schaap
PSG Chain
R Piskol
R Smith-Unna
Reema Singh
RL Chisholm
RS Young
S Kumar
SF Altschul
T Steijger
TBK Reddy
TD Wu
TJ Jentsch
W Xue
WH Majoros
WJ Kent
YL Xie
Publication venue: Springer Science and Business Media LLC
Publication date: 24/05/2016

Background: Annotation of gene models and transcripts is a fundamental step in genome sequencing projects. Often this is performed with automated prediction pipelines, which can miss complex and atypical genes or transcripts. RNA sequencing (RNA-seq) data can aid the annotation with empirical data. Here we present de novo transcriptome assemblies generated from RNA-seq data in four Dictyostelid species: D. discoideum, P. pallidum, D. fasciculatum and D. lacteum. The assemblies were incorporated with existing gene models to determine corrections and improvement on a whole-genome scale. This is the first time this has been performed in these eukaryotic species. Results: An initial de novo transcriptome assembly was generated by Trinity for each species and then refined with Program to Assemble Spliced Alignments (PASA). The completeness and quality were assessed with the Benchmarking Universal Single-Copy Orthologs (BUSCO) and Transrate tools at each stage of the assemblies. The final datasets of 11,315-12,849 transcripts contained 5,610-7,712 updates and corrections to >50% of existing gene models including changes to hundreds or thousands of protein products. Putative novel genes were also identified and alternative splice isoforms were observed for the first time in P. pallidum, D. lacteum and D. fasciculatum. Conclusions: In taking a whole transcriptome approach to genome annotation with empirical data we have been able to enrich the annotations of four existing genome sequencing projects. In doing so we have identified updates to the majority of the gene annotations across all four species under study and found putative novel genes and transcripts which could be worthy of follow-up. The new transcriptome data we present here will be a valuable resource for genome curators in the Dictyostelia and we propose this effective methodology for use in other genome annotation projects.

Kölner UniversitätsPublikationsServer

Springer - Publisher Connector

University of Dundee Online Publications

Computational Analysis and Experimental Validation of Gene Predictions in Toxoplasma gondii

Author: A Khan
A. Khan TS
Andras Fiser
B Gajria
Carlos J. Madrid-Aliste
CG Elsik
D Xia
Dmitry Rykunov
DN Perkins
DS Roos
Edward Nieves
F Lu
Fa-Yun Che
I Korf
J.C. Kissenger KCH
JD Jaffe
JL Jones
Joseph M. Dybas
JR Radke
K Kim
Kami Kim
L Kall
Louis M. Weiss
MP Washburn
OE Sousa
P Nielsen
PJ Bradley
PR Jungblut
R Guigo
R Wang
RD Finn
Ruth Hogue Angeletti
S Fauquenoy
S Karlin
SF Altschul
SJ Sanderson
Steven L. Salzberg
TJ Stevens
VB Carruthers
W Li
WH Majoros
WR Bowie
XW Zhou
XW Zhou
Publication venue: Public Library of Science
Publication date: 09/12/2008

Toxoplasma gondii is an obligate intracellular protozoan that infects 20 to 90% of the population. It can cause both acute and chronic infections, many of which are asymptomatic, and, in immunocompromised hosts, can cause fatal infection due to reactivation from an asymptomatic chronic infection. An essential step towards understanding molecular mechanisms controlling transitions between the various life stages and identifying candidate drug targets is to accurately characterize the T. gondii proteome. We have explored the proteome of T. gondii tachyzoites with high throughput proteomics experiments and by comparison to publicly available cDNA sequence data. Mass spectrometry analysis validated 2,477 gene coding regions with 6,438 possible alternative gene predictions, approximately one third of the T. gondii proteome. The proteomics survey identified 609 proteins that are unique to Toxoplasma as compared to any known species, including other apicomplexans. Computational analysis identified 787 cases of possible gene duplication events and located at least 6,089 gene coding regions. Commonly used gene prediction algorithms produce very disparate sets of protein sequences, with pairwise overlaps ranging from 1.4% to 12%. Through this experimental and computational exercise we benchmarked gene prediction methods and observed false negative rates of 31 to 43%. This study not only provides the largest proteomics exploration of the T. gondii proteome, but also illustrates how high throughput proteomics experiments can elucidate correct gene structures in genomes.

RNA-Seq improves annotation of protein-coding genes in the cucumber genome

Author: AL Price
B Haas
B Langmead
BJ Haas
BJ Haas
BJ Haas
BJ Haas
C Trapnell
C Trapnell
E Birney
EP Nawrocki
F Denoeud
G Parra
GKCo Scientists
H Tang
HA Lorenzi
I Korf
J Jurka
J Ling
Kui Lin
L Stein
LD Stein
M Guttman
M Stanke
M Suyama
MG Grabherr
MR Brent
O Gotoh
O Gotoh
O Jaillon
O Keller
P Rice
PE Larsen
Pengcheng Yan
R Li
RC Edgar
RD Morin
S Griffiths-Jones
S Guo
S Huang
S Hunter
S Ouyang
SA Filichkin
Sanwen Huang
SF Altschul
TM Lowe
TM Lowe
V Ter-Hovhannisyan
W Li
WH Majoros
WJ Kent
Z Wang
Z Xu
Zhangjun Fei
Zhen Li
Zhonghua Zhang
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: As more and more genomes are sequenced, genome annotation becomes increasingly important in bridging the gap between sequence and biology. Gene prediction, which is at the center of genome annotation, usually integrates various resources to compute consensus gene structures. However, many newly sequenced genomes have limited resources for gene predictions. In an effort to create high-quality gene models of the cucumber genome (Cucumis sativus var. sativus), based on the EVidenceModeler gene prediction pipeline, we incorporated the massively parallel complementary DNA sequencing (RNA-Seq) reads of 10 cucumber tissues into EVidenceModeler. We applied the new pipeline to the reassembled cucumber genome and included a comparison between our predicted protein-coding gene sets and a published set. Results: The reassembled cucumber genome, annotated with RNA-Seq reads from 10 tissues, has 23,248 identified protein-coding genes. Compared with the published prediction in 2009, approximately 8,700 genes reveal structural modifications and 5,285 genes only appear in the reassembled cucumber genome. All the related results, including genome sequence and annotations, are available at http://cmb.bnu.edu.cn/Cucumis_sativus_v20/. Conclusions: We conclude that RNA-Seq greatly improves the accuracy of prediction of protein-coding genes in the reassembled cucumber genome. The comparison between the two gene sets also suggests that it is feasible to use RNA-Seq reads to annotate newly sequenced or less-studied genomes.

Springer - Publisher Connector

Towards an Evolutionary Model of Transcription Networks

Author: A Gossler
A Ledgard
A Siepel
A Tanay
A Vailaya
AE Tsong
AI Su
AL Toth
Amos Tanay
AR Borneman
AT Kalinka
BB Tuch
Chieh-Chun Chen
D Kuo
D Xie
D Xie
Dan Xie
DC King
E Segal
E Segal
G Guo
G Konopka
G Kunarso
GA Wray
GP Wagner
GR Mishra
HE Hoekstra
I Holmes
I Tirosh
I Ulitsky
J Adjaye
J Cai
J Gerhart
J Ihmels
J Quackenbush
J Ruan
J Wang
JL Gordon
LD Ward
M Hasegawa
M King
MA Beer
MC Oldham
MJ Herrgard
MS Cline
O Cappâe
P Khaitovich
P Ray
R Lister
R Siddharthan
R Siddharthan
RM Ewing
S Marcellini
S Nikolaev
SB Carroll
Sheng Zhong
T Domazet-Loso
TC Doetschman
TL Bailey
V Mustonen
WH Majoros
X Chen
X He
X He
Xiaoyi Cao
Xin He
Y Gilad
Y Xing
Publication venue: Public Library of Science
Publication date: 01/01/2011

DNA evolution models have made invaluable contributions to comparative genomics, although it seemed formidable to include non-genomic features in these models. In order to build an evolutionary model of transcription networks (TNs), we had to forfeit the substitution model used in DNA evolution and start from modeling the evolution of the regulatory relationships. We present a quantitative evolutionary model of TNs, subjecting the phylogenetic distance and the evolutionary changes of cis-regulatory sequence, gene expression and network structure to one probabilistic framework. Using the genome sequences and gene expression data from multiple species, this model can predict regulatory relationships between a transcription factor (TF) and its target genes in all species, and thus identify TN re-wiring events. Applying this model to analyze the pre-implantation development of three mammalian species, we identified the conserved and re-wired components of the TNs downstream of a set of TFs including Oct4, Gata3/4/6, cMyc and nMyc. Evolutionary events on the DNA sequence that led to turnover of TF binding sites were identified, including the birth of an Oct4 binding site by a 2nt deletion. In contrast to recent reports of large interspecies differences of TF binding sites and gene expression patterns, the interspecies difference in TF-target relationship is much smaller. The data showed increasing conservation levels from genomic sequences to TF-DNA interaction, gene expression, TN, and finally to morphology, suggesting that evolutionary changes are larger at molecular levels and smaller at functional levels. The data also showed that evolutionarily older TFs are more likely to have conserved target genes, whereas younger TFs tend to have larger re-wiring rates.

CiteSeerX

Natural History Museum Repository

Genome of the facultative scuticociliatosis pathogen Pseudocohnilembus persalinus provides insight into its virulence through horizontal gene transfer

Author: A Bateman
A Bernsel
AH Futerman
AL Shug
AU Orjih
BJ Haas
BY Jee
C Trapnell
C Trapnell
E Neter
EH Lee
EL Sonnhammer
F de la Cruz
F Griffith
FR Evans
G Vonheijne
GE Baida
H Piccard
HH Niemann
I Azad
I Stojiljkovic
J Balla
JA Eisen
JA Sakanari
JG Songer
JL Lovett
JM Aury
JP Toutant
JS Seo
JT Jones
JY Song
L Leon-Rodriguez
L Leon-Rodriguez
L Puig
M Aikawa
M Sajid
M Stanke
M Syvanen
MG Grabherr
MG Low
MH Saier
MJ Stanhope
O Billker
P Moore
PB Crosbie
PJ Cheung
PJ de Wit
PJ Keeling
PS Coburn
R Banerjee
R Iglesias
R Pomp
R Taguchi
RC Edgar
RM Macnab
RW Titball
S Guindon
S Maere
S Weibo
SD Li
SM Kim
SM Kim
SP Shin
SRM Jones
SV Saveliev
T Akiba
TG Clark
TJ Egan
UL Rosewich
WA Hendrickson
WH Majoros
X Fan
X Pan
X Wang
YC Chen
Publication venue: Springer Science and Business Media LLC
Publication date: 21/10/2015

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ The attached file is the published version of the article

Institute of Hydrobiology, Chinese Academy Of Sciences

Comparative analysis of information contents relevant to recognition of introns in many species

Author: A Levine
AA Patel
AA Salamov
AE Vinogradov
C Burge
C Guthrie
CB Burge
CJ Langford
DL Black
DV Lu
E Pruesse
EV Koonin
G Ast
G Kol
GJ Goodall
GS Slater
H Banerjee
Hiroaki Iwata
J Felsenstein
J MacQueen
JA Berglund
JD Beggs
JM Izquierdo
JU Pontius
K Katoh
K Katoh
K Wiebauer
KL Fox-Walsh
L Collins
LP Lim
M Borodovsky
M Chen
M Davila Lopez
M Lynch
M Marz
N Sheth
NF Kaufer
NL Harris
O Gotoh
O Gotoh
Osamu Gotoh
P Puigbo
PA Sharp
PHA Sneath
PP Gardner
S Kullback
S Kullback
SH Schwartz
SL Salzberg
SP Lloyd
TR Gregory
V Anantharaman
V Brendel
V Douris
W Zhu
WH Majoros
Y Kapustin
Publication venue: Springer Science and Business Media LLC
Publication date

The Viral and Cellular MicroRNA Targetome in Lymphoblastoid Cell Lines

Author: A Grimson
A Grundhoff
A Grundhoff
A Rodriguez
AB Rickinson
AK Lo
AR Marquitz
B Langmead
BP Lewis
Bryan R. Cullen
Christopher L. Frank
CL Jopling
D Nachmani
DA Thorley-Lawson
David L. Corcoran
DB Rosen
DD Jima
DL Corcoran
Dong Kang
DP Bartel
E Anastasiadou
E Gottwein
E Gottwein
E Gottwein
E Seto
E Vigorito
Eva Gottwein
EY Choy
F Grey
F Lu
F Osorio
F Petrocca
G Gatto
G Meister
G Xu
H Iizasa
Henri-Jacques Delecluse
HR Lin
IW Boss
J Jiang
J Mrazek
J Zhang
Jay A. Nelson
JE Cameron
JE Cameron
Jeffrey D. Nusbaum
JL Mott
JL Umbach
JL Zhao
JY Zhu
K Cosmopoulos
K Kapinas
K Okamura
KA O'Donnell
KJ Riley
L Dolken
L Li
M Hafner
M Rehmsmeier
M Selbach
Markus Hafner
Micah A. Luftig
MJ Allday
N Faumont
PA Nikitin
PP Medina
PS Eis
Q Yin
R Amoroso
R Feederle
R Feederle
R Morgan
RC Friedman
Rebecca L. Skalsky
Regina Feederle
RL Skalsky
RL Skalsky
RW Lung
S Barth
S Costinean
S Griffiths-Jones
S Pfeffer
SD Linnstaedt
SJ Chen
T Regad
T Xia
Thomas Tuschl
U Karvonen
U Santanam
Uwe Ohler
WH Majoros
X Cai
Y Kawahara
Y Pekarsky
Y Refaeli
Y Tay
Y Zhao
YT Lin
ZL Pratt
Publication venue: Public Library of Science
Publication date: 01/01/2012

Epstein-Barr virus (EBV) is a ubiquitous human herpesvirus linked to a number of B cell cancers and lymphoproliferative disorders. During latent infection, EBV expresses 25 viral pre-microRNAs (miRNAs) and induces the expression of specific host miRNAs, such as miR-155 and miR-21, which potentially play a role in viral oncogenesis. To date, only a limited number of EBV miRNA targets have been identified; thus, the role of EBV miRNAs in viral pathogenesis and/or lymphomagenesis is not well defined. Here, we used photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation (PAR-CLIP) combined with deep sequencing and computational analysis to comprehensively examine the viral and cellular miRNA targetome in EBV strain B95-8-infected lymphoblastoid cell lines (LCLs). We identified 7,827 miRNA-interaction sites in 3,492 cellular 3′UTRs. Of these sites, 531 contained seed matches to viral miRNAs. Reporter assays confirmed 24 PAR-CLIP-identified miRNA:3′UTR interactions. Our results reveal that EBV miRNAs predominantly target cellular transcripts during latent infection, thereby manipulating the host environment. Furthermore, targets of EBV miRNAs are involved in multiple cellular processes that are directly relevant to viral infection, including innate immunity, cell survival, and cell proliferation. Finally, we present evidence that myc-regulated host miRNAs from the miR-17/92 cluster can regulate latent viral gene expression. This comprehensive survey of the miRNA targetome in EBV-infected B cells represents a key step towards defining the functions of EBV-encoded miRNAs, and potentially, identifying novel therapeutic targets for EBV-associated malignancies.

MDC Repository

FigShare

Single nucleus genome sequencing reveals high similarity among nuclei of an endomycorrhizal fungus

Author: A Bairoch
A Conesa
A Gandolfi
A Gollotte
A Stamatakis
A Stamatakis
AJ Enright
AL Price
BJ Haas
BJ Haas
BJ Haas
C Angelard
C Sbrana
CJA Sigrist
CL Schoch
CS Campbell
D Croll
D Croll
D Redecker
D Redecker
Desheng Mu
DGO Saunders
Diane G. O. Saunders
Duur K. Aanen
E Boon
E Quevillon
E Tisserant
Erik Limpens
Erli Pang
ES Dolgin
F Martin
F Martin
F Martin
G Kuhn
G Parra
G Parra
G Sun
Geraldine Butler
H Shimodaira
H Stockinger
HC den Bakker
Huifen Cao
HW Mewes
Hwangho Cha
I Korf
I Stergiopoulos
IR Arkhipova
IR Sanders
J Jorda
J Jurka
J Kämper
J Lee
J Marleau
J Win
JE Galagan
Joe Win
K Katoh
KA Sędzielewska
KO Wrzeszczynski
Kui Lin
L Li
L-J Ma
M Cokol
M Hijri
M Magrane
M Parniske
M Stanke
MW Dimmic
N Corradi
N King
N Navin
Norbert de Ruijter
O Emanuelsson
O Gotoh
O Gotoh
P Horton
P Vandenkoornhuyse
Qian Zhou
R Apweiler
R Luo
R Riley
RA Dean
RC Edgar
René Geurts
RH Nilsson
Robin van Velzen
S Altschul
S Capella-Gutiérrez
S Halary
S Joneson
S Kloppholz
Sanwen Huang
Sergey Ivanov
SF Altschul
Sophien Kamoun
TA Torto
Tao Lin
TE Pawlowska
TH Eickbush
Ton Bisseling
Trupti Sharma
TY James
TY James
V Ter-Hovhannisyan
VCG Tagua
W Li
WH Majoros
Yi Shang
Ying Li
YJ Liu
Z Iqbal
Z Iqbal
Z Xu
Zhonghua Zhang
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2014

Nuclei of arbuscular endomycorrhizal fungi have been described as highly diverse due to their asexual nature and absence of a single cell stage with only one nucleus. This has raised fundamental questions concerning speciation, selection and transmission of the genetic makeup to next generations. Although this concept has become textbook knowledge, it is only based on studying a few loci, including 45S rDNA. To provide a more comprehensive insight into the genetic makeup of arbuscular endomycorrhizal fungi, we applied de novo genome sequencing of individual nuclei of Rhizophagus irregularis. This revealed a surprisingly low level of polymorphism between nuclei. In contrast, within a nucleus, the 45S rDNA repeat unit turned out to be highly diverged. This finding demystifies a long-lasting hypothesis on the complex genetic makeup of arbuscular endomycorrhizal fungi. Subsequent genome assembly resulted in the first draft reference genome sequence of an arbuscular endomycorrhizal fungus. Its length is 141 Mbp, representing over 27,000 protein-coding gene models. We used the genomic sequence to reinvestigate the phylogenetic relationships of Rhizophagus irregularis with other fungal phyla. This unambiguously demonstrated that Glomeromycota are more closely related to Mucoromycotina than to their postulated sister Dikarya.